Design and Implementation of a Library Metadata Management Framework and its Application in Fuzzy Data Deduplication and Data Reconciliation with Authority Data

نویسنده

  • Martin Czygan
چکیده

We describe the application of a generic workflow management system to the problem of metadata processing in the library domain. The requirements for such a framework and acting real-world forces are examined. The design of the framework is layed out and illustrated by means of two example workflows: fuzzy data deduplication and data reconciliation with authority data. Fuzzy data deduplication is the process of finding similar items in record collections that can not be matched by identifiers. Data reconciliation with authority data takes a data source and enhances the metadata of its records – mainly authors and subjects – with available normed data. Finally, the advantages and tradeoffs of the presented approach are discussed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Design and Implementation of a Comprehensive Database of the Written Heritage of Science and Technology

Purpose: This study aims to design and implement a comprehensive database of the written heritage of science and technology in the Regional Information Center for Science and Technology (RICeST) and determine the metadata elements required to describe the manuscripts. Method: This study was carried out by the content analysis method to identify the metadata elements needed to describe the coll...

متن کامل

On-Line Nonlinear Dynamic Data Reconciliation Using Extended Kalman Filtering: Application to a Distillation Column and a CSTR

Extended Kalman Filtering (EKF) is a nonlinear dynamic data reconciliation (NDDR) method. One of its main advantages is its suitability for on-line applications. This paper presents an on-line NDDR method using EKF. It is implemented for two case studies, temperature measurements of a distillation column and concentration measurements of a CSTR. In each time step, random numbers with zero m...

متن کامل

Study of the foundation, models and issues of research data curation and management in scientific and academic environments

Background and Aim: The purpose of this paper is to study, identifying and discuss the foundation and concepts, models and frameworks, dimensions and challenges of research data curation and management in scientific and academic environments. Method: This article is a review article and library method was used to collect scientific and research texts in this field. In this research, external an...

متن کامل

بررسی پایگاه های کتاب الکترونیکی با تاکید بر ابر داده

Introduction: With the exponential growth of electronic resources on the Web, the application of metadata has enhanced the precision of retrieval and facilitated the search of electronic resources. Hence, the aim of this study was to determine the application of metadata in e-book databases. Methods: This study is an applied work, which was carried out through survey methods in 2013. The pop...

متن کامل

A multi-period distribution network design model under demand uncertainty

Supply chain management is taken into account as an inseparable component in satisfying customers' requirements. This paper deals with the distribution network design (DND) problem which is a critical issue in achieving supply chain accomplishments. A capable DND can guarantee the success of the entire network performance. However, there are many factors that can cause fluctuations in input dat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014